Conversation

@nuzant commented Oct 9, 2025

This pull request adds comprehensive support and documentation for running AReaL experiments with SkyPilot on cloud and Kubernetes infrastructures. It introduces example YAML configurations for both single-node and multi-node experiments, a detailed README for SkyPilot usage, and step-by-step installation instructions. These changes make it much easier to launch distributed AReaL experiments on GCP or Kubernetes using SkyPilot.

SkyPilot Integration and Documentation

  • Added a new section to docs/tutorial/installation.md with step-by-step instructions for installing and verifying SkyPilot, including GCP and Kubernetes setup guidance.
  • Created examples/skypilot/README.md providing detailed usage examples, explanations, and command lines for running AReaL experiments with SkyPilot, covering both single-node and multi-node setups.

Example Configurations for SkyPilot

  • Added examples/skypilot/local.yaml as a template for launching a single-node AReaL experiment with SkyPilot on GCP, specifying resources, storage, and launch commands.
  • Added examples/skypilot/ray_cluster.yaml for launching a multi-node Ray cluster with SkyPilot, including setup for distributed training and shared storage.
  • Added examples/skypilot/gsm8k_grpo_ray.yaml as a sample AReaL experiment configuration for Ray-based distributed training, detailing experiment parameters and resource allocation.

UPDATE: Separated examples and launcher into 2 PRs: #464

@nuzant changed the title from "Mzy/skypilot" to "[Feature] Add SkyPilot launcher and examples" on Oct 9, 2025

nuzant commented Oct 9, 2025

/gemini review


@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces first-class support for SkyPilot, enabling AReaL experiments to be run on cloud and Kubernetes infrastructure. The changes include a new SkyPilotLauncherConfig, the SkyPilotLauncher implementation, and extensive documentation and examples.

The overall implementation is solid and follows SkyPilot's best practices. The new launcher is well-structured, handling cluster provisioning, job submission, and state management correctly. The documentation is also comprehensive and will be very helpful for users.

I've found a few issues that should be addressed:

  • There are hardcoded network ports in the launcher, which could cause conflicts.
  • There's a bug in the calculation of trainer nodes, leading to incorrect resource allocation.
  • The example ray_cluster.yaml and its corresponding documentation contain a shell script with syntax errors and a logic bug that would cause worker nodes to terminate prematurely.

Addressing these points will improve the robustness and correctness of the SkyPilot integration. Great work on adding this powerful feature!

future launches.

```bash
sky volumes apply storage-volume.yaml
```

You need to make it clear where the user should execute the steps in this README from.
Here it assumes `examples/skypilot`, but later it assumes the root of the repo.

Collaborator Author

I think we could make the content about cloud buckets and volumes shorter and refer to the SkyPilot cloud bucket and volume guides.

Also, I have checked the other places to ensure that users can execute these commands from the root of the repo.

```yaml
/storage: areal-shared-storage
setup: |
  pip3 install -e .
```

Maybe worth creating a virtual env instead of installing with pip as root

Collaborator Author

Using the AReaL repo root directory as the workdir together with our image ensures that we do not need `pip install -e .` (or any other installation) before launching the experiment. Therefore, the `setup` section here is removed.


```bash
export WANDB_API_KEY=<your-wandb-api-key>
sky launch -c areal --secret WANDB_API_KEY examples/skypilot/ray_cluster.yaml
```

This command fails for me with this:

(head, rank=0, pid=4232) Executing training script on head node...
(worker1, rank=1, pid=3359, ip=10.170.27.163) Node setup complete for rank 1.
(head, rank=0, pid=4232) 2025-10-11 02:34:06,774        WARNING services.py:394 -- Found multiple active Ray instances: {'10.156.61.243:6380', '10.156.61.243:6379'}. Connecting to latest cluster at 10.156.61.243:6379. You can override this by setting the `--address` flag or `RAY_ADDRESS` environment variable.
(head, rank=0, pid=4232) 2025-10-11 02:34:06,774        INFO worker.py:1771 -- Connecting to existing Ray cluster at address: 10.156.61.243:6379...
(head, rank=0, pid=4232) 2025-10-11 02:34:06,785        INFO worker.py:1942 -- Connected to Ray cluster. View the dashboard at http://127.0.0.1:8265 
(head, rank=0, pid=4232) Traceback (most recent call last):
(head, rank=0, pid=4232)   File "<frozen runpy>", line 198, in _run_module_as_main
(head, rank=0, pid=4232)   File "<frozen runpy>", line 88, in _run_code
(head, rank=0, pid=4232)   File "/root/sky_workdir/areal/launcher/ray.py", line 591, in <module>
(head, rank=0, pid=4232)     main()
(head, rank=0, pid=4232)   File "/root/sky_workdir/areal/launcher/ray.py", line 330, in main
(head, rank=0, pid=4232)     config, _ = parse_cli_args(sys.argv[1:])
(head, rank=0, pid=4232)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(head, rank=0, pid=4232)   File "/root/sky_workdir/areal/api/cli_args.py", line 1308, in parse_cli_args
(head, rank=0, pid=4232)     cfg = hydra_compose(
(head, rank=0, pid=4232)           ^^^^^^^^^^^^^^
(head, rank=0, pid=4232)   File "/usr/local/lib/python3.12/dist-packages/hydra/compose.py", line 38, in compose
(head, rank=0, pid=4232)     cfg = gh.hydra.compose_config(
(head, rank=0, pid=4232)           ^^^^^^^^^^^^^^^^^^^^^^^^
(head, rank=0, pid=4232)   File "/usr/local/lib/python3.12/dist-packages/hydra/_internal/hydra.py", line 594, in compose_config
(head, rank=0, pid=4232)     cfg = self.config_loader.load_configuration(
(head, rank=0, pid=4232)           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(head, rank=0, pid=4232)   File "/usr/local/lib/python3.12/dist-packages/hydra/_internal/config_loader_impl.py", line 142, in load_configuration
(head, rank=0, pid=4232)     return self._load_configuration_impl(
(head, rank=0, pid=4232)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(head, rank=0, pid=4232)   File "/usr/local/lib/python3.12/dist-packages/hydra/_internal/config_loader_impl.py", line 244, in _load_configuration_impl
(head, rank=0, pid=4232)     parsed_overrides, caching_repo = self._parse_overrides_and_create_caching_repo(
(head, rank=0, pid=4232)                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(head, rank=0, pid=4232)   File "/usr/local/lib/python3.12/dist-packages/hydra/_internal/config_loader_impl.py", line 228, in _parse_overrides_and_create_caching_repo
(head, rank=0, pid=4232)     parsed_overrides = parser.parse_overrides(overrides=overrides)
(head, rank=0, pid=4232)                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(head, rank=0, pid=4232)   File "/usr/local/lib/python3.12/dist-packages/hydra/core/override_parser/overrides_parser.py", line 99, in parse_overrides
(head, rank=0, pid=4232)     raise OverrideParseException(
(head, rank=0, pid=4232) hydra.errors.OverrideParseException: mismatched input '=' expecting <EOF>
(head, rank=0, pid=4232) See https://hydra.cc/docs/1.2/advanced/override_grammar/basic for details
(head, rank=0, pid=4232) Node setup complete for rank 0.

Collaborator Author

It is caused by `+trainer_env_vars="WANDB_API_KEY=$WANDB_API_KEY"`. This is a limitation of Hydra, whose override grammar does not allow an unquoted `=` inside command-line argument values. Currently, users can only set environment variables in the YAML config file. We are looking for workarounds that let users set environment variables on the command line.

For now, I think we should just remove `WANDB_API_KEY` from the examples to keep them clear and runnable.
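For illustration, a minimal sketch of the quoting problem (using a dummy key; the `+launcher.trainer_env_vars` override name is taken from the reviewer's comment below — not verified against the launcher):

```shell
# Sketch of the quoting issue; 'dummy' stands in for a real key.
WANDB_API_KEY=dummy

# Broken form: after shell quote removal, hydra receives
#   trainer_env_vars=WANDB_API_KEY=dummy
# and the second '=' trips its override grammar.
broken_token=$(printf '%s' trainer_env_vars="WANDB_API_KEY=$WANDB_API_KEY")

# Reviewer's workaround: an extra quoting layer keeps the inner key=value pair
# inside one double-quoted hydra value.
fixed_token=$(printf '%s' '+launcher.trainer_env_vars="WANDB_API_KEY='"$WANDB_API_KEY"'"')

echo "$broken_token"
echo "$fixed_token"
```

The difference is only in what single token the shell hands to Hydra: the broken form exposes a bare second `=`, the fixed form keeps it inside quotes.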

```bash
--config examples/skypilot/gsm8k_grpo_ray.yaml \
experiment_name=<your experiment name> \
trial_name=<your trial name> \
trainer_env_vars="WANDB_API_KEY=$WANDB_API_KEY"
```

This is wrong and needs to be replaced with `'+launcher.trainer_env_vars="WANDB_API_KEY='$WANDB_API_KEY'"'`, otherwise it fails.

Collaborator Author

Same as above.

Comment on lines 123 to 127
If `GCP: enabled` or `Kubernetes: enabled` are shown, you're ready to use SkyPilot with
AReaL. Check [here](../examples/skypilot.md) for a detailed example to run AReaL with
SkyPilot. For more options and details for SkyPilot, see the official
[SkyPilot installation guide](https://docs.skypilot.co/en/latest/getting-started/installation.html).


We need to link to this page on how to configure K8s to work with SkyPilot: https://docs.skypilot.co/en/latest/reference/kubernetes/kubernetes-setup.html

Collaborator Author

Added reference in the Kubernetes setup section above.

resolve and distributed checkpointing. The following guideline shows how to use SkyPilot
volumes to set up high-performance shared storage.

1. **Define the volume.** Create a YAML file describing the volume you want SkyPilot to


While using volumes is fine, this is not required. And using cloud buckets could be simpler: https://docs.skypilot.co/en/latest/reference/storage.html

Collaborator Author

Added cloud bucket usage in the example.

If `GCP: enabled` or `Kubernetes: enabled` are shown, you're ready to use SkyPilot with
AReaL. Check [here](../examples/skypilot.md) for a detailed example to run AReaL with

This file doesn't exist

Collaborator Author

Here we changed this link to https://github.com/inclusionAI/AReaL/blob/main/examples/skypilot/README.md to ensure the link is available in our documentation pages after this PR is merged into main.

```bash
# Ensure your kubeconfig is at ~/.kube/config
mkdir -p ~/.kube
cp /path/to/kubeconfig ~/.kube/config
```

It's unclear where /path/to/kubeconfig comes from

Collaborator Author

Removed this and referred to the SkyPilot K8s setup guide instead.

```yaml
resources:
  accelerators: H100:8
  image_id: docker:ghcr.io/inclusionai/areal-runtime:v0.3.3
```

It looks like this image is configured to use a custom PyPI index https://pypi.antfin-inc.com/simple.
It doesn't work for me. Here's what I see:

(setup pid=4496) Looking in indexes: https://pypi.antfin-inc.com/simple
(setup pid=4496) Obtaining file:///root/sky_workdir
(setup pid=4496)   Installing build dependencies: started
(setup pid=3465, ip=10.170.27.38) Looking in indexes: https://pypi.antfin-inc.com/simple
(setup pid=3465, ip=10.170.27.38) Obtaining file:///root/sky_workdir
(setup pid=3465, ip=10.170.27.38)   Installing build dependencies: started
(setup pid=4496)   Installing build dependencies: still running...
(setup pid=3465, ip=10.170.27.38)   Installing build dependencies: still running...
(setup pid=4496)   Installing build dependencies: still running...
(setup pid=3465, ip=10.170.27.38)   Installing build dependencies: still running...
(setup pid=4496)   Installing build dependencies: still running...
(setup pid=3465, ip=10.170.27.38)   Installing build dependencies: still running...
(setup pid=4496)   Installing build dependencies: still running...
(setup pid=3465, ip=10.170.27.38)   Installing build dependencies: still running...
(setup pid=4496)   Installing build dependencies: still running...
(setup pid=3465, ip=10.170.27.38)   Installing build dependencies: still running...
(setup pid=4496)   Installing build dependencies: still running...
(setup pid=4496)   Installing build dependencies: finished with status 'error'
(setup pid=4496)   error: subprocess-exited-with-error
(setup pid=4496)   
(setup pid=4496)   × pip subprocess to install build dependencies did not run successfully.
(setup pid=4496)   │ exit code: 1
(setup pid=4496)   ╰─> [8 lines of output]
(setup pid=4496)       Looking in indexes: https://pypi.antfin-inc.com/simple
(setup pid=4496)       WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fbd80fda300>, 'Connection to pypi.antfin-inc.com timed out. (connect timeout=100.0)')': /simple/setuptools/
(setup pid=4496)       WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fbd80bca2d0>, 'Connection to pypi.antfin-inc.com timed out. (connect timeout=100.0)')': /simple/setuptools/
(setup pid=4496)       WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fbd80bca480>, 'Connection to pypi.antfin-inc.com timed out. (connect timeout=100.0)')': /simple/setuptools/
(setup pid=4496)       WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fbd80bca5d0>, 'Connection to pypi.antfin-inc.com timed out. (connect timeout=100.0)')': /simple/setuptools/
(setup pid=4496)       WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fbd80bca780>, 'Connection to pypi.antfin-inc.com timed out. (connect timeout=100.0)')': /simple/setuptools/
(setup pid=4496)       ERROR: Could not find a version that satisfies the requirement setuptools>=61.0 (from versions: none)
(setup pid=4496)       ERROR: No matching distribution found for setuptools>=61.0
(setup pid=4496)       [end of output]
(setup pid=4496)   
(setup pid=4496)   note: This error originates from a subprocess, and is likely not a problem with pip.
(setup pid=4496) error: subprocess-exited-with-error
(setup pid=4496) 
(setup pid=4496) × pip subprocess to install build dependencies did not run successfully.
(setup pid=4496) │ exit code: 1
(setup pid=4496) ╰─> See above for output.
(setup pid=4496) 
(setup pid=4496) note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Job 1's setup failed with return code list: [137, 1]
✓ Job finished (status: FAILED_SETUP).
command terminated with exit code 100

Collaborator Author

It does not require any installation to run experiments now. However, the PyPI index is still custom in our public image. We will note this and fix the problem in our next image release.
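Until the image is fixed, one possible workaround (an assumption on our side, not tested against this image) is to point pip back at the public index via its standard environment variable, e.g. in the task's `setup` section:

```shell
# Override the image's custom index with the public PyPI index.
# PIP_INDEX_URL is pip's standard environment variable; no image changes needed.
export PIP_INDEX_URL=https://pypi.org/simple
echo "$PIP_INDEX_URL"
```

Any `pip install` run afterwards in the same shell would then resolve packages against pypi.org instead of the unreachable internal mirror.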

```bash
echo "Starting Ray head node..."
ray start --head --port=6379
while [ $(ray nodes | grep NODE_ID | wc -l) -lt $num_nodes ]; do
```

The `ray nodes` command doesn't exist:

(head, rank=0, pid=4484) Usage: ray [OPTIONS] COMMAND [ARGS]...
(head, rank=0, pid=4484) Try 'ray --help' for help.
(head, rank=0, pid=4484) 
(head, rank=0, pid=4484) Error: No such command 'nodes'.

Collaborator Author

Fixed this by using `ray status` instead.
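As a sketch of what the fixed wait loop could count on, the node-line format of `ray status` is assumed below; the counting logic is shown against a canned sample so it can run without Ray:

```shell
# Canned sample standing in for live `ray status` output (format assumed).
sample_status='======== Autoscaler status ========
Node status
---------------------------------------------------------------
Active:
 1 node_f00
 1 node_ba4
Pending:
 (no pending nodes)'

# In the real run script this would be: alive=$(ray status | grep -c '^ 1 node_')
# wrapped in: while [ "$alive" -lt "$num_nodes" ]; do sleep 5; ...; done
alive=$(printf '%s\n' "$sample_status" | grep -c '^ 1 node_')
echo "$alive"
```

If the `Active:` section format differs across Ray versions, the grep pattern would need adjusting accordingly.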

```bash
echo "Executing training script on head node..."
python3 -m areal.launcher.ray examples/math/gsm8k_grpo.py \
  --config examples/skypilot/gsm8k_grpo_ray.yaml \
  experiment_name=<your experiment name> \
```

Use `envs` and `secrets` to set these as env vars:

```yaml
envs:
  EXPERIMENT_NAME: my-areal-experiment
  TRIAL_NAME: my-trial-name

secrets:
  WANDB_API_KEY: null
```

and then:

```bash
experiment_name=$EXPERIMENT_NAME \
trial_name=$TRIAL_NAME \
```

Collaborator Author

Done.

@alex000kim left a comment


I think this PR is a bit raw.
The training job doesn't run due to incorrect syntax in several places:

  • non-existent commands
  • incorrect parameters
  • etc.


nuzant commented Oct 13, 2025

I think this PR is a bit raw. The training job doesn't run due to incorrect syntax in several places:

  • non-existent commands
  • incorrect parameters
  • etc.

Thanks for your review! We have GCP access now and we will be able to test and debug this PR by ourselves. We will start fixing this PR right away.

Comment on lines 10 to 11

```yaml
n_nodes: 2
n_gpus_per_node: 1
```

Collaborator

Is n_nodes=2 correct? What should be the desired n_gpus_per_node?


What's the goal of using a 2-node cluster with 1 GPU on each node instead of a single node with 2x, 4x or even 8x GPUs?


I'd recommend using SKYPILOT_NUM_NODES and SKYPILOT_NUM_GPUS_PER_NODE instead. See https://docs.skypilot.co/en/latest/running-jobs/environment-variables.html

Collaborator Author

I have changed the example to a setting with 2x 8-GPU nodes. `cluster.n_nodes` and `cluster.n_gpus_per_node` are now set from `$SKYPILOT_NUM_NODES` and `$SKYPILOT_NUM_GPUS_PER_NODE` in the `run` field of the SkyPilot YAML file.
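Based on the change described, the forwarding could look roughly like this in the SkyPilot task's `run` section (a sketch; the exact launch command and config paths are assumptions taken from the examples quoted elsewhere in this thread):

```yaml
run: |
  python3 -m areal.launcher.ray examples/math/gsm8k_grpo.py \
    --config examples/skypilot/gsm8k_grpo_ray.yaml \
    experiment_name=$EXPERIMENT_NAME \
    trial_name=$TRIAL_NAME \
    cluster.n_nodes=$SKYPILOT_NUM_NODES \
    cluster.n_gpus_per_node=$SKYPILOT_NUM_GPUS_PER_NODE
```

This way the AReaL cluster spec always tracks whatever `num_nodes` and accelerator count the user requested at `sky launch` time.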

```bash
--config examples/math/gsm8k_grpo.yaml \
experiment_name=gsm8k-grpo \
trial_name=trial0 \
cluster.n_gpus_per_node=2 \
```

Collaborator

Move this to yaml, or specify both n_nodes and n_gpus_per_node.


I'd recommend using SKYPILOT_NUM_NODES and SKYPILOT_NUM_GPUS_PER_NODE instead. See https://docs.skypilot.co/en/latest/running-jobs/environment-variables.html

Collaborator Author

Same as above.

```yaml
run: |
  # Get the Head node's IP and total number of nodes (environment variables injected by SkyPilot).
  head_ip=$(echo "$SKYPILOT_NODE_IPS" | head -n1)
  num_nodes=$(echo "$SKYPILOT_NODE_IPS" | wc -l)
```

Collaborator Author

`num_nodes` is now replaced by `$SKYPILOT_NUM_NODES`.

### Running AReaL with Ray Launcher

The following example shows how to set up a Ray cluster with SkyPilot and then use AReaL
to run GRPO on the GSM8K dataset on 2 nodes, each with 1 A100 GPU. This example runs on


What's the point of using "2 nodes, each with 1 A100 GPU" instead of a single node with several GPUs?
This training will be much slower due to the slow interconnectivity speed between nodes.


It should be fine as an MVP to show distributed training :)

Collaborator Author

Its original purpose was to demonstrate distributed training. I have changed the example to a more practical setting with 2x 8-GPU nodes.

```yaml
file_mounts:
  /storage: gs://areal-default
```

Instead of hard-coding gs, can we use something like

```yaml
file_mounts:
  /my_data:
    source: s3://my-bucket/  # or gs://, https://<azure_storage_account>.blob.core.windows.net/<container>, r2://, cos://<region>/<bucket>, oci://<bucket_name>
    mode: MOUNT  # MOUNT or COPY or MOUNT_CACHED. Defaults to MOUNT. Optional.
```

as per https://docs.skypilot.co/en/latest/reference/storage.html

Collaborator Author

Done.

```yaml
run: |
  # Get the Head node's IP and total number of nodes (environment variables injected by SkyPilot).
  head_ip=$(echo "$SKYPILOT_NODE_IPS" | head -n1)
  num_nodes=$(echo "$SKYPILOT_NODE_IPS" | wc -l)
```

replace with an env var

Collaborator Author

Done

@@ -0,0 +1,42 @@

```yaml
resources:
  infra: gcp
```

Let's not hard-code GCP and remove the infra here altogether?
One of the key value props of SkyPilot is the ability to run the same workload on different clouds and k8s.
So the users are able to do this:

```bash
sky launch -c mycluster sky.yaml --infra aws
sky launch -c mycluster sky.yaml --infra gcp
sky launch -c mycluster sky.yaml --infra k8s
```

Collaborator Author

I have removed the hard-coded GCP infra in the YAML files and added a note on how to use different clouds and K8s.


## (Optional) Install SkyPilot

SkyPilot helps you run AReaL easily on cloud or Kubernetes infrastructures. Below shows


I'd add a link to SkyPilot docs + mention that it supports 17+ clouds

Collaborator Author

Of course!


alex000kim commented Oct 20, 2025

I'd recommend renaming examples/skypilot/local.yaml to examples/skypilot/local.sky.yaml and examples/skypilot/ray_cluster.yaml to examples/skypilot/ray_cluster.sky.yaml.
Otherwise, it can be confusing to distinguish which YAMLs are SkyPilot tasks and which are AReaL configs.

Also, "local" in local.yaml is confusing since it doesn't run locally.

```yaml
num_nodes: 1

file_mounts:
  /storage: gs://areal-default
```

It might be worth pointing out that /storage is set in examples/skypilot/gsm8k_grpo_ray.yaml

Collaborator Author

Added a comment to clarify this.

@alex000kim

I confirm that I was able to run examples/skypilot/local.yaml on K8s

@alex000kim

For me, examples/skypilot/ray_cluster.yaml only worked after adding

```yaml
config:
  kubernetes:
    pod_config:
      spec:
        containers:
        - securityContext:
            capabilities:
              add:
              - IPC_LOCK
```

but it might be specific to my cluster since the nodes are connected via infiniband.
The rest of it looks good.


nuzant commented Oct 21, 2025

I'd recommend renaming examples/skypilot/local.yaml to examples/skypilot/local.sky.yaml and examples/skypilot/ray_cluster.yaml to examples/skypilot/ray_cluster.sky.yaml. Otherwise, it can be confusing trying to distinguish which yamls are SkyPilot tasks and which ones are areal configs.

Also, "local" in local.yaml is confusing since it doesn't run locally.

Great suggestion! Changed local.yaml to single_node.sky.yaml and ray_cluster.yaml to ray_cluster.sky.yaml.


nuzant commented Oct 21, 2025

For me, examples/skypilot/ray_cluster.yaml only worked after adding

```yaml
config:
  kubernetes:
    pod_config:
      spec:
        containers:
        - securityContext:
            capabilities:
              add:
              - IPC_LOCK
```

but it might be specific to my cluster since the nodes are connected via infiniband. The rest of it looks good.

Added a note on the additional config needed when using a cluster with InfiniBand.


nuzant commented Oct 21, 2025

Thanks for your review! Please check again if recent changes have addressed your comments. @alex000kim @garrett4wade
